144 research outputs found

    Parallelization of the AVL FIRE Benchmark with SVM-Fortran

    This article outlines the parallelization of an irregular grid application with SVM-Fortran and describes the optimizations applied and their effectiveness. The parallelization was much simplified by the performance analysis tool OPAL, a source-code-based tool for requesting and analyzing runtime performance data. Although shared memory parallelization is easier than distributed memory parallelization, understanding and eliminating the overhead from page faults is impossible without such a tool. OPAL relates the page faults to the arrays and to locations in the source code. One area that OPAL does not support, but where supporting tools are highly desirable, is the performance degradation due to low utilization of the on-chip cache.

    Distribution of Periscope Analysis Agents on ALTIX 4700

    Designing an Adaptive Application-Level Checkpoint Management System for Malleable MPI Applications

    Dynamic resource management opens up numerous opportunities in High Performance Computing. It improves system-level services as well as application performance. Checkpointing can also be regarded as a system-level service and can reap the benefits offered by dynamism. A checkpointing system can have better resource availability by integrating with a malleable resource management system. In addition to fault tolerance, the checkpointing system can cater to the data redistribution demands of malleable applications during resource changes. Therefore, we propose iCheck, an adaptive application-level checkpoint management system that can efficiently utilize system-level and application-level dynamism to provide better checkpointing and data redistribution services to applications. Comment: Third International Symposium on Checkpointing for Supercomputing (SuperCheck-SC22)

    Scalability and Performance Analysis of OpenMP Codes Using the Periscope Toolkit

    In this paper, we present two new approaches, together with the necessary extensions to Periscope, for scalability and performance analysis of OpenMP codes. Periscope is an online performance analysis toolkit consisting of a user-defined number of analysis agents that automatically search for performance properties while the application is running. To detect scalability and performance bottlenecks of OpenMP codes using Periscope, a few newly defined performance properties and meta properties are formalized. We demonstrate our implementation by evaluating the NAS OpenMP benchmarks. As our results show, the approach identifies code regions that do not scale well, along with other performance problems, e.g. load imbalance, in the NAS parallel benchmarks.

    SVM Support in the Vienna Fortran Compilation System

    Vienna Fortran, a machine-independent language extension to Fortran which allows the user to write programs for distributed-memory systems using global addresses, provides the forall-loop construct for specifying irregular computations that do not cause inter-iteration dependences. Compilers for distributed-memory systems generate code that is based on runtime analysis techniques and is only efficient if, in addition, aggressive compile-time optimizations are applied. Since these optimizations are difficult to perform, we propose to generate shared virtual memory code instead, which can benefit from appropriate operating system or hardware support. This paper presents the shared virtual memory code generation, compares the two approaches, and gives first performance results.

    Zum Stigma des Sozialschmarotzers

    A Comparison of two Parallelization Strategies for TRACE

    In this report, we compare two methods of parallelizing a finite element code that describes water flow in soils. The first method uses Domain Decomposition based on a parallel Schwarz algorithm. The second method uses a Data Partitioning approach pursued in High Performance Fortran (HPF). Experiments with the parallel versions were performed on the Paragon XP/S 10 at KFA.